fix: fix a race condition in the tracker #19513
base: main
5b34fb2 to b3d4ad5
b3d4ad5 to 2301021
This commit fixes a race condition in the push tracker where both the p.done and p.err channels could have a value at the same time. This could cause the distributor to return success when it should instead have returned an error.
2301021 to 5dc2cd5
Claude finished @grobinson-grafana's task
Code Review Complete ✅
I've thoroughly reviewed this PR that fixes a race condition in the push tracker. Here's my detailed analysis:
Review Checklist:
Summary
This is an excellent fix that addresses a critical race condition in the distributor's push tracking mechanism. The change replaces an unsafe channel-based approach with a proper mutex-protected state machine.
Race Condition Analysis ⭐
The Problem: The original
// OLD CODE - UNSAFE
func (p *pushTracker) doneWithResult(err error) {
	if err == nil {
		if p.streamsPending.Dec() == 0 {
			p.done <- struct{}{} // ← Could send here
		}
	} else {
		if p.streamsFailed.Inc() == 1 {
			p.err <- err // ← AND here simultaneously
		}
	}
}
The
The Fix: The new
Code Quality Assessment
✅ Strengths
✅ Test Coverage
The test suite is comprehensive and well-designed:
The fuzz test is particularly valuable for catching race conditions.
Technical Implementation
Correct Synchronization Pattern
func (t *basicPushTracker) Wait(ctx context.Context) error {
	t.mtx.Lock()
	if t.firstErr != nil || t.n == 0 {
		res := t.firstErr
		t.mtx.Unlock()
		return res // Fast path - no channel operations needed
	}
	t.mtx.Unlock()
	select {
	case <-ctx.Done():
		return ctx.Err()
	case <-t.doneCh:
		return t.firstErr // No mutex needed - immutable after close
	case <-t.errCh:
		return t.firstErr // No mutex needed - immutable after close
	}
}
This pattern is correct because:
Memory Safety
The implementation correctly handles the fact that once either channel is closed, the
Performance Considerations
Positive impacts:
Neutral impact:
Security Assessment
Integration Analysis
The changes integrate well with the existing codebase:
Minor Suggestions
Verdict: APPROVED ✅
This is a high-quality fix that:
The implementation demonstrates solid understanding of Go concurrency patterns and the fix directly addresses the core issue described in the PR.
What this PR does / why we need it:
This commit fixes a race condition in the push tracker where both the p.done and p.err channels could have a value at the same time. This could cause the distributor to return success when it should instead have returned an error.
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Checklist
CONTRIBUTING.md guide (required)
feat PRs are unlikely to be accepted unless a case can be made for the feature actually being a bug fix to existing behavior.
docs/sources/setup/upgrade/_index.md
deprecated-config.yaml and deleted-config.yaml files respectively in the tools/deprecated-config-checker directory. Example PR